Global Emissions and GDP

MEDS
R
environment modeling
statistics
data science
Exploring the decoupling effects across countries
Author

Aakriti Poudel

Published

December 3, 2025

Introduction

Can a country reduce emissions while growing its GDP? This question has gained significant attention in recent years as countries around the world try to pursue economic development while maintaining environmental sustainability. In this assignment, we examine global power-sector emissions and GDP data from more than 200 countries to explore whether sustained economic growth can occur alongside meaningful emission reductions.

Background

Economic growth and greenhouse gas emissions have traditionally been closely linked. An increase in GDP is often accompanied by higher emissions because of greater energy consumption, expanding industrial activity and infrastructure growth. However, recent trends show that some countries are beginning to break this pattern. By analyzing emissions alongside GDP, we can identify which countries are achieving ‘decoupling’, where the economy grows without a corresponding rise in emissions. Decoupling can result from a combination of factors such as clean energy, improvements in energy efficiency, technological innovation and effective climate policies. Studying these trends provides valuable insight into how countries can pursue sustainable development, balancing economic progress with environmental responsibility. It can also serve as a guide for other nations that aim to reduce their carbon footprint while maintaining economic growth.

Directed Acyclic Graph (DAG)

The following DAG describes the key relationships between emissions and GDP. It shows how the Economy (EC), Energy use (EU), Technology (T) and Governance (G) interact to shape both Emissions quantity (EQ) and GDP. The economy influences energy use, emissions and GDP directly, and technological development indirectly through energy use. Energy use contributes to technological change, which then affects both emissions and GDP. Governance has a direct impact on GDP as well. Overall, the DAG highlights the direct and indirect connections among these variables and helps identify which variables need to be controlled in the analysis to reduce confounding and selection bias.

Variables Abbreviation
Economy EC
Emissions Quantity EQ
Energy use EU
Technology T
Governance G
GDP Growth GDP
Code
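# Load the packages needed to define and plot the DAG
library(dagitty)
library(ggdag)
library(ggplot2)

# Define the DAG (each edge is listed once)
dag <- dagitty("
dag {
  EC -> EQ
  EC -> EU -> T
  EC -> GDP
  T -> EQ
  T -> GDP
  G -> GDP
}
")

ggdag(dag, text = TRUE) +
  theme_void()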
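
One way to see which variables the DAG implies should be controlled is dagitty's adjustmentSets() function. The sketch below assumes that GDP is treated as the exposure and EQ as the outcome; that choice is an assumption made here for illustration, and a different choice would give different adjustment sets.

```r
library(dagitty)

# Same DAG as in the post, with each edge listed once
dag <- dagitty("
dag {
  EC -> EQ
  EC -> EU -> T
  EC -> GDP
  T -> EQ
  T -> GDP
  G -> GDP
}
")

# Assumption: GDP is the exposure and EQ is the outcome
adj_sets <- adjustmentSets(dag, exposure = "GDP", outcome = "EQ")
print(adj_sets)
```

Each set that is returned lists variables that, if controlled for, close the backdoor paths between the chosen exposure and outcome under this DAG.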

Data Description

Here, we will work with the following data sets:

GDP data

The data set comes from the World Bank and uses the indicator NY.GDP.MKTP.CD, which reports national gross domestic product (GDP) in current U.S. dollars. It measures the total value of all goods and services produced in a country in a given year, converted into U.S. dollars using that year’s official exchange rate. As it is a ‘current’ GDP, the values reflect the prices and exchange rates of the respective year, not adjusted for inflation or purchasing-power differences.

Global emissions data

This data set is retrieved from Climate TRACE. It provides a detailed, open-access global inventory of greenhouse gas and air pollution emissions. It aggregates data from hundreds of millions of emission sources worldwide including power plants, industrial facilities, transportation, agriculture, and more, allowing emissions to be broken down by country, sector, sub-sector and time period. The data cover multiple years, starting from 2015 for annual country-level data, with monthly and source-level records available since 2021.

In this analysis, we focus specifically on the power sector data, which provides detailed emissions estimates, allowing us to examine trends and impacts within one of the largest sources of global greenhouse-gas emissions.

Hypotheses Statement

Question: Is it possible to achieve emission reductions while maintaining GDP growth?

H0: There is no significant relationship between GDP growth and emission reductions.

HA: There is a significant relationship between GDP growth and emission reductions.

Data Simulation

In this project, we will examine whether it is possible for countries to achieve emission reductions while maintaining GDP growth. For decoupling to occur, one variable (GDP) must increase while the other (emissions) decreases. Additionally, the global emissions and GDP data exhibit clear signs of overdispersion, meaning the variance is much larger than the mean, which makes the Negative Binomial Model an appropriate choice for analysis.
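
A quick way to look for overdispersion in count data is the variance-to-mean ratio. The sketch below is not run on the project data; it uses two simulated samples with made-up parameters (a Poisson sample and a negative binomial sample with the same mean) only to show what the ratio looks like in each case.

```r
set.seed(42)  # hypothetical seed, for reproducibility only

# Hypothetical counts with the same mean but different dispersion
poisson_counts <- rpois(5000, lambda = 20)
nb_counts      <- rnbinom(5000, mu = 20, size = 2)

# Variance-to-mean ratio: near 1 for Poisson data, well above 1 when overdispersed
dispersion_ratio <- function(x) var(x) / mean(x)

ratio_poisson <- dispersion_ratio(poisson_counts)
ratio_nb      <- dispersion_ratio(nb_counts)
print(c(poisson = ratio_poisson, negative_binomial = ratio_nb))
```

A ratio far above 1 suggests that a Poisson model would understate the variability and that a negative binomial model may be a better fit.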

Before fitting the model to the actual data, we will simulate data that conform to the Negative Binomial Model. This will help us understand the model and ensure that it behaves as expected before applying it to the real-world data set.

The statistical notation for the Negative Binomial Model is:

\[ \begin{aligned} \text{CountOutcome} &\sim \text{NegativeBinomial}(\mu, \theta) \\ \log(\mu) &= \beta_0 + \beta_1 \text{Predictor} \end{aligned} \]

Fit a negative binomial model like this:

model <- glm.nb(count_outcome ~ predictor, data = my_data)
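
For reference, under the parameterization used by MASS::glm.nb() and by rnbinom(mu = , size = ), the dispersion parameter \(\theta\) determines how much the variance exceeds the mean. Writing \(Y\) for the outcome, the standard result (stated here without derivation) is:

```latex
\[
\begin{aligned}
\mathrm{E}[Y] &= \mu \\
\mathrm{Var}(Y) &= \mu + \frac{\mu^{2}}{\theta}
\end{aligned}
\]
```

As \(\theta\) grows, the variance approaches \(\mu\) and the model approaches a Poisson model; a small \(\theta\) means strong overdispersion.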

Set up

Code
# Load all necessary libraries for data analysis
library(tidyverse)
library(janitor)
library(MASS)
library(modelsummary)

Set parameters and test model

Code
# Set the parameters
n <- 10000 # Data set size
beta0 <- 0.6
beta1 <- 1.8
r <- 2 # Dispersion parameter (theta), passed to rnbinom() as size

# Uniform distribution between 1 and 10
x <- runif(n, min = 1, max = 10)

# Calculate mean (mu) using the Negative Binomial link function (log link)
mu <- exp(beta0 + beta1 * x)

# Simulate the dependent variable (y)
y <- rnbinom(n = n, mu = mu, size = r)

# Create a data set
my_data <- data.frame(x, y)

# Fit a negative binomial model
negbinomial <- glm.nb(y ~ x, data = my_data)

Visualize the model

Code
library(ggplot2)

# Create predictions from the model
my_data$prediction <- predict(negbinomial, type = "response")

# Simple scatter plot with fitted line
ggplot(my_data, aes(x = x, y = y)) +
  geom_point(alpha = 0.8, size = 0.2, color = "hotpink") +
  geom_line(aes(y = prediction), color = "darkgreen", linewidth = 1) +
  labs(title = "Negative Binomial Model",
       x = "X axis",
       y = "Y axis") +
  theme_minimal()

Let’s check the coefficients of this model.

Code
# Summarize the fitted model in a table (reuses the model fitted above)
negbinomial_summary <- list("Negative Binomial" = negbinomial)
modelsummary(negbinomial_summary)
              Negative Binomial
(Intercept)   0.573
              (0.017)
x             1.804
              (0.003)
Num.Obs.      10000
AIC           228619.8
BIC           228641.5
Log.Lik.      -114306.923
RMSE          15841830.05

Next, let’s check the dispersion parameter (theta), which controls the amount of overdispersion in the data. The estimate should be close to the value of 2 used in the simulation.

Code
# Extract the estimated dispersion parameter (theta) from the fitted model
estimated_theta <- negbinomial$theta
print(paste0("Estimated Theta: ", round(estimated_theta, 4)))
[1] "Estimated Theta: 1.9778"

GDP and Emissions Data

Read GDP and emissions data set

Code
# Read the data for power sector
electricity <- read_csv(here::here('posts', '2025-12-global-emissions-gdp', 'data', 'power', 'DATA', 'electricity-generation_country_emissions_v5_1_0.csv'),
                        na = c(" ", "0", "0.0", "NA")) %>%
  clean_names()

heat_plants <- read_csv(here::here('posts', '2025-12-global-emissions-gdp', 'data', 'power', 'DATA', 'heat-plants_country_emissions_v5_1_0.csv'),
                        na = c(" ", "0", "0.0", "NA")) %>% 
  clean_names()

other_energy <- read_csv(here::here('posts', '2025-12-global-emissions-gdp', 'data', 'power', 'DATA', 'other-energy-use_country_emissions_v5_1_0.csv'),
                        na = c(" ", "0", "0.0", "NA")) %>%
  clean_names()

# Read the GDP data set
gdp <- read_csv(here::here('posts', '2025-12-global-emissions-gdp', 'data', 'API_NY.GDP.MKTP.CD_DS2_en_csv_v2_280632', 'API_NY.GDP.MKTP.CD_DS2_en_csv_v2_280632.csv'),
                        na = c(" ", "NA"),
                skip = 4) %>%
  clean_names()

Data wrangling

Combine and clean power sub sector data for yearly analysis.

Code
# Combine all three sub sector data sets
power <- bind_rows(electricity, heat_plants, other_energy)

# Make a new column 'year' to segregate according to year
power_year <- power %>%
  mutate(year = lubridate::year(start_time))

# Rename the column and drop the year 2025
power_year_cleaned <- power_year %>%
  # Rename the column to match the GDP data set
  rename(country_code = iso3_country) %>%
  # Drop the year 2025 (no GDP data available for 2025)
  filter(year != 2025)

Clean and reshape GDP data for further analysis.

Code
# Convert data to long format
gdp_long <- gdp %>%
  # Drop 'x70' column and columns from x1960 to x2014
  dplyr::select(-x70, -(x1960:x2014)) %>%
  pivot_longer(cols = starts_with("x"),
               names_to = "year",
               values_to = "gdp_values") %>%
  # Remove the leading 'x' from 'year' and convert it to a numeric data type
  mutate(year = as.numeric(stringr::str_remove(year, "^x")))

Join power sub sector and GDP data for a streamlined structure and easy analysis.

Code
# Join the data frame and reorder the columns
power_gdp <- power_year_cleaned %>%
  left_join(gdp_long, by = c("country_code", "year")) %>%
  # Keep only rows with a non-empty 'country_name'
  filter(!is.na(country_name), country_name != "") %>%
  # Reorder the columns by position
  dplyr::select(13, 1, 12, everything())
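
Because a left join keeps every emissions row even when no GDP record matches, it can be worth checking which country codes fail to match before filtering. The sketch below uses base R's merge() on small made-up data frames; the codes and values are hypothetical and only illustrate the check.

```r
# Hypothetical emissions and GDP records (not real data)
emissions_toy <- data.frame(
  country_code = c("AAA", "BBB", "ZZZ"),
  year         = c(2020, 2020, 2020),
  emissions    = c(10, 20, 30)
)
gdp_toy <- data.frame(
  country_code = c("AAA", "BBB"),
  year         = c(2020, 2020),
  gdp_values   = c(100, 200)
)

# all.x = TRUE keeps every emissions row, like a left join
joined_toy <- merge(emissions_toy, gdp_toy,
                    by = c("country_code", "year"), all.x = TRUE)

# Country codes with no matching GDP record
unmatched_codes <- unique(joined_toy$country_code[is.na(joined_toy$gdp_values)])
print(unmatched_codes)
```

The same idea applies to the real join: rows with a missing GDP value after joining point to codes that exist in one data set but not the other.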

Identify the ten highest-emitting countries

Code
# Calculate the mean emissions of countries for each year (2015 - 2024)
power_gdp_mean <- power_gdp %>%
  drop_na(emissions_quantity) %>%
  group_by(country_name, year) %>%
  summarize(mean = mean(emissions_quantity),
            .groups = 'drop') %>%
  arrange(desc(mean))

# Top 10 countries by overall mean emissions for all years
top_10_countries <- power_gdp_mean %>%
  group_by(country_name) %>%
  summarize(overall_mean = mean(mean), .groups = 'drop') %>%
  arrange(desc(overall_mean)) %>%
  slice_head(n = 10)

# Filter to keep only top 10 countries
power_gdp_mean_top10 <- power_gdp_mean %>%
  filter(country_name %in% top_10_countries$country_name)

Visualize the top ten highest emitting countries.

Code
 ggplot(power_gdp_mean_top10, aes(x = year, y = mean, color = country_name)) +
  geom_line() +
  scale_x_continuous(breaks = seq(2015, 2024, by = 1)) +
  scale_y_continuous(labels = scales::label_number(scale = 1e-6, suffix = "M t")) +
  labs(x = "Year",
       y = "Mean CO2 emissions (metric tonnes)",
       title = "Top ten highest emitting countries from 2015 to 2024",
       color = "Country") +
  theme_classic()

Calculate GDP and emissions changes
Code
# Aggregate GDP and emissions by country, subsector and year
country_totals <- power_gdp %>%
  group_by(country_name, country_code, subsector, year) %>%
  summarize(total_emissions = sum(emissions_quantity, na.rm = TRUE),
            gdp_values = first(gdp_values),
            .groups = 'drop')

# Calculate annual changes in GDP and emissions within each country and subsector,
# so that lag() compares a subsector with its own previous year
annual_changes <- country_totals %>%
  arrange(country_code, subsector, year) %>%
  group_by(country_name, country_code, subsector) %>%
  mutate(emissions_pct_change = (total_emissions - lag(total_emissions)) / lag(total_emissions) * 100,
         gdp_pct_change = (gdp_values - lag(gdp_values)) / lag(gdp_values) * 100) %>%
  ungroup()
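
The percentage-change calculation can be checked on a tiny made-up series. This sketch uses base R rather than dplyr::lag(), and the numbers are hypothetical values for a single country and subsector.

```r
# Hypothetical yearly totals for one country and one subsector
toy <- data.frame(
  year            = 2015:2018,
  total_emissions = c(100, 110, 99, 99),
  gdp_values      = c(200, 210, 231, 220)
)

# Year-over-year percentage change: (current - previous) / previous * 100
pct_change <- function(x) {
  previous <- c(NA, head(x, -1))  # shift values down by one, like dplyr::lag()
  (x - previous) / previous * 100
}

toy$emissions_pct_change <- pct_change(toy$total_emissions)
toy$gdp_pct_change       <- pct_change(toy$gdp_values)
print(toy)
```

The first year has no previous value, so its change is NA. In this toy series, 2017 is the only year in which emissions fall while GDP rises.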

Identify countries with increased GDP and decreased emissions

Code
# Find the cases where emissions decreased while GDP increased
decoupling <- annual_changes %>%
  filter(emissions_pct_change < 0 & gdp_pct_change > 0) %>%
  dplyr::select(country_name, subsector, year, emissions_pct_change, gdp_pct_change)

# Summarize decoupling cases by country (each row is one subsector-year case)
decoupling_case <- decoupling %>%
  group_by(country_name) %>%
  summarize(decoupling_years = n(),
            avg_emission_reduction = mean(emissions_pct_change),
            avg_gdp_growth = mean(gdp_pct_change),
            .groups = 'drop') %>%
  arrange(desc(decoupling_years))
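
The decoupling rule itself is simple: a case counts only when emissions fall and GDP rises in the same year. The sketch below applies that rule to a small made-up table in base R; the country labels and values are hypothetical.

```r
# Hypothetical annual changes for two countries (illustrative values only)
changes <- data.frame(
  country_name         = c("A", "A", "A", "B", "B", "B"),
  year                 = c(2016, 2017, 2018, 2016, 2017, 2018),
  emissions_pct_change = c(-2.0, 1.5, -0.5, 3.0, -1.0, NA),
  gdp_pct_change       = c(3.0, 2.0, 1.0, 4.0, -2.0, 2.5)
)

# Decoupling: emissions fall AND GDP rises in the same row
is_decoupled <- changes$emissions_pct_change < 0 & changes$gdp_pct_change > 0

# which() drops rows where the comparison is NA (missing change values)
decoupled <- changes[which(is_decoupled), ]

# Number of decoupling cases per country
decoupling_counts <- table(decoupled$country_name)
print(decoupling_counts)
```

Country B has no qualifying rows here: in 2017 both emissions and GDP fall, and in 2018 the emissions change is missing, so neither case is counted.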

Visualize the top ten countries with decoupling effects

Code
# Filter the top ten countries with decoupling effects
decoup_ten <- decoupling_case %>%
  arrange(desc(decoupling_years)) %>%
  slice_head(n = 10)

# Plot a graph for top ten countries
ggplot(decoup_ten, aes(x = reorder(country_name, decoupling_years), 
                       y = decoupling_years)) +
  geom_col(fill = "darkolivegreen4") +
  geom_text(aes(label = decoupling_years), hjust = -0.3, size = 3.5) +
  coord_flip() +
  theme_classic() +
  labs(x = "Top ten countries",
    y = "Number of years with decoupling",
    title = "Top 10 countries: Years of emission reduction while increasing GDP") +
  ylim(0, max(decoup_ten$decoupling_years) * 1.1)

Fit GDP and emissions data into a model

Code
# Fit the model
nb_model <- glm.nb(decoupling_years ~ avg_gdp_growth + avg_emission_reduction,
                data = decoupling_case)
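
Coefficients from glm.nb() are on the log scale, so exponentiating them gives rate ratios that are often easier to read. The sketch below illustrates this on simulated data, not on decoupling_case; the variable names (growth, count), the seed and the true coefficients are all hypothetical.

```r
library(MASS)

set.seed(1)  # hypothetical seed, for reproducibility only

# Hypothetical data standing in for a count outcome and one predictor
n <- 500
toy <- data.frame(growth = runif(n, min = 0, max = 10))
toy$count <- rnbinom(n, mu = exp(0.5 + 0.1 * toy$growth), size = 2)

# Fit a negative binomial model with a log link
toy_model <- glm.nb(count ~ growth, data = toy)

# Exponentiated coefficients: multiplicative change in the expected count
# for a one-unit increase in the predictor
rate_ratios <- exp(coef(toy_model))
print(rate_ratios)
```

A rate ratio above 1 means the expected count increases as the predictor increases, and a ratio below 1 means it decreases.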

Create prediction data for GDP and emissions

Code
# Predictions for GDP growth (holding emission reduction constant)
pred_gdp <- decoupling_case %>%
  mutate(avg_emission_reduction = mean(decoupling_case$avg_emission_reduction),
         variable = "GDP growth") %>%
  # type = "response" returns predicted counts rather than log-scale values
  mutate(pred = predict(nb_model, newdata = ., type = "response"))

# Predictions for emission reduction (holding GDP growth constant)
pred_emission <- decoupling_case %>%
  mutate(avg_gdp_growth = mean(decoupling_case$avg_gdp_growth),
         variable = "Emission Reduction") %>%
  # type = "response" returns predicted counts rather than log-scale values
  mutate(pred = predict(nb_model, newdata = ., type = "response"))

# Combine predictions for GDP and emissions
pred_combined <- bind_rows(pred_gdp, pred_emission)
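
When predictions are compared with observed counts, the scale matters: predict() on a glm.nb() fit returns values on the log (link) scale unless type = "response" is requested. The sketch below shows the difference on simulated data; the data, seed and variable names are hypothetical.

```r
library(MASS)

set.seed(2)  # hypothetical seed, for reproducibility only

# Hypothetical count data with one predictor
toy <- data.frame(x = runif(300, min = 0, max = 5))
toy$y <- rnbinom(300, mu = exp(0.3 + 0.4 * toy$x), size = 3)

toy_model <- glm.nb(y ~ x, data = toy)

new_x <- data.frame(x = c(1, 3, 5))

# Default: linear predictor on the log scale
pred_link <- predict(toy_model, newdata = new_x)

# Expected counts on the original scale
pred_response <- predict(toy_model, newdata = new_x, type = "response")

# Exponentiating the link-scale predictions recovers the response scale
print(all.equal(exp(pred_link), pred_response))
```

Plotting link-scale predictions against raw counts would put the fitted line on a different scale from the points, so the response scale is usually the one to plot.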

Visualize the model predictions

Code
# Plot the GDP and emissions
ggplot() +
  geom_point(data = decoupling_case, 
             aes(x = avg_gdp_growth, y = decoupling_years, shape = "GDP growth"),
             size = 0.5, alpha = 0.7, color = "hotpink") +
  geom_point(data = decoupling_case,
             aes(x = avg_emission_reduction, y = decoupling_years, shape = "Emissions reduction"),
             size = 0.5, alpha = 0.7, color = "darkgreen") +
  geom_line(data = pred_gdp,
            aes(x = avg_gdp_growth, y = pred, color = "GDP growth"),
            linewidth = 0.8) +
  geom_line(data = pred_emission,
            aes(x = avg_emission_reduction, y = pred, color = "Emissions reduction"),
            linewidth = 0.8) +
  scale_color_manual(values = c("GDP growth" = "hotpink",
                                "Emissions reduction" = "darkgreen"),
                     name = "Predictor") +
  scale_shape_manual(values = c("GDP growth" = 12, "Emissions reduction" = 12),
                     name = "Predictor") +
  theme_classic() +
  theme(legend.position = "right",
        plot.title = element_text(face = "bold", size = 14)) +
  labs(x = "Predictor value (%)",
       y = "Decoupling years",
       title = "A decade of decoupling: Rising GDP and Declining Emissions",
       subtitle = "Lines show model predictions; points show actual observations")

Citation

BibTeX citation:
@online{poudel2025,
  author = {Poudel, Aakriti},
  title = {Global {Emissions} and {GDP}},
  date = {2025-12-03},
  url = {https://aakriti-poudel-chhetri.github.io/posts/2025-12-global-emissions-gdp/},
  langid = {en}
}
For attribution, please cite this work as:
Poudel, Aakriti. 2025. “Global Emissions and GDP.” December 3, 2025. https://aakriti-poudel-chhetri.github.io/posts/2025-12-global-emissions-gdp/.
